Marco Cuturi

CREST, ENSAE ParisTech

Locking Pretrained Weights via Deep Low-Rank Residual Distillation

May 11, 2026
Via arXiv

DynaMiCS: Fine-tuning LLMs with Performance Constraints using Dynamic Mixtures

May 11, 2026
Via arXiv

Nectar: Neural Estimation of Cached-Token Attention via Regression

May 10, 2026
Via arXiv

The Coupling Within: Flow Matching via Distilled Normalizing Flows

Mar 09, 2026
Via arXiv

Amortizing Maximum Inner Product Search with Learned Support Functions

Mar 09, 2026
Via arXiv

The Design Space of Tri-Modal Masked Diffusion Models

Feb 25, 2026
Via arXiv

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

Feb 13, 2026
Via arXiv

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration

Dec 26, 2025
Via arXiv

Learning Unmasking Policies for Diffusion Language Models

Dec 12, 2025
Via arXiv

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining

Oct 02, 2025
Via arXiv